SHMEM_SUM(3)                                                      SHMEM_SUM(3)

NAME
     shmem_comp4_sum_to_all, shmem_comp8_sum_to_all,
     shmem_complexd_sum_to_all, shmem_complexf_sum_to_all,
     shmem_double_sum_to_all, shmem_float_sum_to_all, shmem_int_sum_to_all,
     shmem_int4_sum_to_all, shmem_int8_sum_to_all, shmem_long_sum_to_all,
     shmem_longdouble_sum_to_all, shmem_longlong_sum_to_all,
     shmem_real4_sum_to_all, shmem_real8_sum_to_all,
     shmem_real16_sum_to_all, shmem_short_sum_to_all - Performs a sum
     reduction across a set of processing elements (PEs)

SYNOPSIS
     C or C++:

          #include <mpp/shmem.h>

          void shmem_complexd_sum_to_all(double complex *target,
               double complex *source, int nreduce, int PE_start,
               int logPE_stride, int PE_size, double complex *pWrk,
               long *pSync);

          void shmem_complexf_sum_to_all(float complex *target,
               float complex *source, int nreduce, int PE_start,
               int logPE_stride, int PE_size, float complex *pWrk,
               long *pSync);

          void shmem_double_sum_to_all(double *target, double *source,
               int nreduce, int PE_start, int logPE_stride, int PE_size,
               double *pWrk, long *pSync);

          void shmem_float_sum_to_all(float *target, float *source,
               int nreduce, int PE_start, int logPE_stride, int PE_size,
               float *pWrk, long *pSync);

          void shmem_int_sum_to_all(int *target, int *source, int nreduce,
               int PE_start, int logPE_stride, int PE_size, int *pWrk,
               long *pSync);

          void shmem_long_sum_to_all(long *target, long *source,
               int nreduce, int PE_start, int logPE_stride, int PE_size,
               long *pWrk, long *pSync);

          void shmem_longdouble_sum_to_all(long double *target,
               long double *source, int nreduce, int PE_start,
               int logPE_stride, int PE_size, long double *pWrk,
               long *pSync);

          void shmem_longlong_sum_to_all(long long *target,
               long long *source, int nreduce, int PE_start,
               int logPE_stride, int PE_size, long long *pWrk,
               long *pSync);

          void shmem_short_sum_to_all(short *target, short *source,
               int nreduce, int PE_start, int logPE_stride, int PE_size,
               short *pWrk, long *pSync);

     Fortran:

          INCLUDE "mpp/shmem.fh"

          INTEGER pSync(SHMEM_REDUCE_SYNC_SIZE)
          INTEGER nreduce, PE_start, logPE_stride, PE_size

          CALL SHMEM_COMP4_SUM_TO_ALL(target, source, nreduce, PE_start,
               logPE_stride, PE_size, pWrk, pSync)

          CALL SHMEM_COMP8_SUM_TO_ALL(target, source, nreduce, PE_start,
               logPE_stride, PE_size, pWrk, pSync)

          CALL SHMEM_INT4_SUM_TO_ALL(target, source, nreduce, PE_start,
               logPE_stride, PE_size, pWrk, pSync)

          CALL SHMEM_INT8_SUM_TO_ALL(target, source, nreduce, PE_start,
               logPE_stride, PE_size, pWrk, pSync)

          CALL SHMEM_REAL4_SUM_TO_ALL(target, source, nreduce, PE_start,
               logPE_stride, PE_size, pWrk, pSync)

          CALL SHMEM_REAL8_SUM_TO_ALL(target, source, nreduce, PE_start,
               logPE_stride, PE_size, pWrk, pSync)

          CALL SHMEM_REAL16_SUM_TO_ALL(target, source, nreduce, PE_start,
               logPE_stride, PE_size, pWrk, pSync)

DESCRIPTION
     The shared memory (SHMEM) reduction routines compute one or more
     reductions across
     symmetric arrays on multiple virtual PEs. A reduction performs an
     associative binary operation across a set of values. For a list of
     other SHMEM reduction routines, see intro_shmem(3).

     As with all SHMEM collective routines, each of these routines assumes
     that only PEs in the active set call the routine. If a PE not in the
     active set calls a SHMEM collective routine, undefined behavior
     results.

     The nreduce argument determines the number of separate reductions to
     perform. The source array on all PEs in the active set provides one
     element for each reduction. The results of the reductions are placed
     in the target array on all PEs in the active set. The active set is
     defined by the PE_start, logPE_stride, PE_size triplet.

     The source and target arrays may be the same array, but they may not
     be overlapping arrays.

     The arguments are as follows:

     target    A symmetric array of length nreduce elements to receive the
               results of the reduction operations.

               The data type of target varies with the version of the
               reduction routine being called and the language used. When
               calling from C/C++, refer to the SYNOPSIS section for data
               type information. When calling from Fortran, the target
               data types are as follows:

               Routine                    Data Type

               shmem_comp4_sum_to_all     COMPLEX(KIND=4).

               shmem_comp8_sum_to_all     Complex. If you are using
                                          Fortran, it must be a default
                                          complex value.

               shmem_int4_sum_to_all      INTEGER(KIND=4).

               shmem_int8_sum_to_all      Integer. If you are using
                                          Fortran, it must be a default
                                          integer value.

               shmem_real4_sum_to_all     REAL(KIND=4).

               shmem_real8_sum_to_all     Real. If you are using Fortran,
                                          it must be a default real value.

               shmem_real16_sum_to_all    Real. If you are using Fortran,
                                          it must be a default real value.

     source    A symmetric array, of length nreduce elements, that contains
               one element for each separate reduction operation. The
               source argument must have the same data type as target.

     nreduce   The number of elements in the target and source arrays.
               nreduce must be of type integer. If you are using Fortran,
               it must be a default integer value.

     PE_start  The lowest virtual PE number of the active set of PEs.
               PE_start must be of type integer. If you are using Fortran,
               it must be a default integer value.

     logPE_stride
               The log (base 2) of the stride between consecutive virtual
               PE numbers in the active set. logPE_stride must be of type
               integer. If you are using Fortran, it must be a default
               integer value.

     PE_size   The number of PEs in the active set. PE_size must be of type
               integer. If you are using Fortran, it must be a default
               integer value.

     pWrk      A symmetric work array. The pWrk argument must have the same
               data type as target. In C/C++, this contains
               max(nreduce/2 + 1, _SHMEM_REDUCE_MIN_WRKDATA_SIZE) elements.
               In Fortran, this contains
               max(nreduce/2 + 1, SHMEM_REDUCE_MIN_WRKDATA_SIZE) elements.

     pSync     A symmetric work array. In C/C++, pSync is of type long and
               size _SHMEM_REDUCE_SYNC_SIZE. In Fortran, pSync is of type
               integer and size SHMEM_REDUCE_SYNC_SIZE. It must be a
               default integer value. Every element of this array must be
               initialized with the value _SHMEM_SYNC_VALUE (in C/C++) or
               SHMEM_SYNC_VALUE (in Fortran) before any of the PEs in the
               active set enter the reduction routine.

     The values of arguments nreduce, PE_start, logPE_stride, and PE_size
     must be equal on all PEs in the active set. The same target and source
     arrays, and the same pWrk and pSync work arrays, must be passed to all
     PEs in the active set.

     Before any PE calls a reduction routine, you must ensure that the
     following conditions exist (synchronization via a barrier or some
     other method is often needed to ensure this):

     * The pWrk and pSync arrays on all PEs in the active set are not still
       in use from a prior call to a collective SHMEM routine.

     * The target array on all PEs in the active set is ready to accept the
       results of the reduction.

     Upon return from a reduction routine, the following are true for the
     local PE:

     * The target array is updated.

     * The data cache region mapped to target is coherent.

     * The values in the pSync array are restored to the original values.

NOTES
     The terms collective, symmetric, and cache aligned are defined in
     intro_shmem(3).

     All SHMEM reduction routines reset the values in pSync before they
     return, so a particular pSync buffer need only be initialized the
     first time it is used.
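
     The argument rules above (the PE_start, logPE_stride, PE_size triplet,
     the pWrk sizing formula, and the pSync initialization requirement) can
     be sketched in plain C without linking against SHMEM. The helper names
     and the three SKETCH_ constants below are hypothetical and chosen only
     for illustration; a real program takes _SHMEM_REDUCE_SYNC_SIZE,
     _SHMEM_SYNC_VALUE, and _SHMEM_REDUCE_MIN_WRKDATA_SIZE from
     <mpp/shmem.h> and must place the arrays in symmetric memory.

```c
#include <stddef.h>

/* Placeholder values for illustration only. They are NOT the values
 * defined by <mpp/shmem.h>; use the real constants in actual code. */
#define SKETCH_REDUCE_SYNC_SIZE        16
#define SKETCH_SYNC_VALUE              (-1L)
#define SKETCH_REDUCE_MIN_WRKDATA_SIZE 8

/* PE number of the i-th member (0-based) of the active set:
 * PE_start + i * 2^logPE_stride. */
int active_set_pe(int PE_start, int logPE_stride, int i)
{
    return PE_start + i * (1 << logPE_stride);
}

/* Returns 1 if pe belongs to the active set, 0 otherwise. */
int in_active_set(int pe, int PE_start, int logPE_stride, int PE_size)
{
    int stride = 1 << logPE_stride;
    int offset = pe - PE_start;

    if (offset < 0 || offset % stride != 0)
        return 0;
    return offset / stride < PE_size;
}

/* Number of pWrk elements: max(nreduce/2 + 1, minimum work size). */
size_t pwrk_elements(int nreduce)
{
    int need = nreduce / 2 + 1;

    return (size_t)(need > SKETCH_REDUCE_MIN_WRKDATA_SIZE
                        ? need
                        : SKETCH_REDUCE_MIN_WRKDATA_SIZE);
}

/* Set every element of a pSync array to the synchronization value.
 * This must be complete on all PEs before any PE enters the reduction. */
void init_psync(long *pSync, size_t n)
{
    for (size_t i = 0; i < n; i++)
        pSync[i] = SKETCH_SYNC_VALUE;
}
```

     With PE_start = 0, logPE_stride = 1, and PE_size = 4, these helpers
     give an active set of PEs 0, 2, 4, and 6. With nreduce = 100 and the
     placeholder minimum of 8, pwrk_elements returns 51.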

     You must ensure that the pSync array is not being updated on any PE in
     the active set while any of the PEs participate in processing of a
     SHMEM reduction routine. Be careful of the following situations:

     * If the pSync array is initialized at run time, some type of
       synchronization is needed to ensure that all PEs in the active set
       have initialized pSync before any of them enter a SHMEM routine
       called with the pSync synchronization array.

     * A pSync or pWrk array can be reused in a subsequent reduction
       routine call only if none of the PEs in the active set are still
       processing a prior reduction routine call that used the same pSync
       or pWrk arrays. In general, this can be assured only by doing some
       type of synchronization. However, in the special case of reduction
       routines being called with the same active set, you can allocate
       two pSync and pWrk arrays and alternate between them on successive
       calls.

EXAMPLES
     Example 1: This Fortran example statically initializes the pSync array
     and finds the sum of the real variable FOO across all even PEs.

          INCLUDE "mpp/shmem.fh"

          INTEGER PSYNC(SHMEM_REDUCE_SYNC_SIZE)
          DATA PSYNC /SHMEM_REDUCE_SYNC_SIZE*SHMEM_SYNC_VALUE/
          PARAMETER (NR=1)
          REAL FOO, FOOSUM, PWRK(MAX(NR/2+1,SHMEM_REDUCE_MIN_WRKDATA_SIZE))
          COMMON /COM/ FOO, FOOSUM, PWRK
          INTRINSIC MY_PE

          ! Active set: PE_start = 0, logPE_stride = 1 (stride 2),
          ! PE_size = N$PES/2, that is, all even-numbered PEs.
          IF ( MOD(MY_PE(),2) .EQ. 0) THEN
                  CALL SHMEM_REAL8_SUM_TO_ALL(FOOSUM, FOO, NR, 0, 1, N$PES/2,
          &       PWRK, PSYNC)
                  PRINT*,'Result on PE ',MY_PE(),' is ',FOOSUM
          ENDIF

     Example 2: Consider the following C/C++ call:

          shmem_int_sum_to_all( target, source, 3, 0, 0, 8, pwrk, psync );

     The preceding call is more efficient than, but semantically equivalent
     to, the combination of the following calls:

          shmem_int_sum_to_all(&(target[0]), &(source[0]), 1, 0, 0, 8,
               pwrk1, psync1);

          shmem_int_sum_to_all(&(target[1]), &(source[1]), 1, 0, 0, 8,
               pwrk2, psync2);

          shmem_int_sum_to_all(&(target[2]), &(source[2]), 1, 0, 0, 8,
               pwrk1, psync1);

     Note that two sets of pWrk and pSync arrays are used alternately
     because no synchronization is done between calls.

SEE ALSO
     intro_shmem(3)

     Message Passing Toolkit: MPI Programmer's Manual